Smart Paper Notes

Quantifying How Hateful Communities Radicalize Online Users

Matheus Schmitz , Keith Burghardt , Goran Muric

Categories: Natural Language Processing | Machine Learning

2022-09-19

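While online social media gives ignored or stifled voices a way to be heard, it also gives users a platform to spread hateful speech. Such speech usually originates in fringe communities, but it can spill over into mainstream channels. In this paper, we measure the impact of joining fringe hateful communities in terms of the hate speech propagated to the rest of the social network. We use Reddit data to assess the effect of joining one type of echo chamber: a digital community of like-minded users who exhibit hateful behavior. We measure members' use of hate speech outside the studied community before and after they become active participants. Using interrupted time series (ITS) analysis as a causal inference method, we gauge the spillover effect, in which hateful language from within a community spreads outside it, by using the level of out-of-community hate word usage as a proxy for learned hate. We study four Reddit sub-communities (subreddits) covering three areas of hate speech: racism, misogyny, and fat-shaming. In all three cases, we find an increase in hate speech outside the originating community, implying that joining such a community leads to the spread of hate speech across the platform. Moreover, users are found to keep picking up this new hateful speech for months after initially joining the community. We show that harmful speech does not stay contained within the community. Our results provide new evidence of the harmful effects of echo chambers and of the potential benefit of moderating them to reduce the adoption of hate speech.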
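
The ITS analysis named in the abstract is commonly implemented as a segmented regression that estimates a change in level and a change in slope at the intervention point. The sketch below is a minimal illustration under that assumption, not the authors' code: the function name `fit_its`, the intervention index `t0`, and the synthetic, noise-free series are all hypothetical.

```python
import numpy as np

def fit_its(y, t0):
    """Segmented regression for an interrupted time series.

    y  : outcome per time step (e.g. a rate of hate-word usage; hypothetical)
    t0 : index of the first post-intervention step (e.g. joining a community)
    Returns [baseline level, baseline slope, level change, slope change].
    """
    y = np.asarray(y, dtype=float)
    t = np.arange(len(y), dtype=float)
    post = (t >= t0).astype(float)  # 1 after the intervention, else 0
    # Design matrix: intercept, time trend, level shift, slope shift.
    X = np.column_stack([np.ones_like(t), t, post, post * (t - t0)])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

# Synthetic, noise-free series with a known jump and slope change at t = 12.
t = np.arange(24)
y = 0.10 + 0.001 * t + np.where(t >= 12, 0.05 + 0.002 * (t - 12), 0.0)
level, slope, level_change, slope_change = fit_its(y, t0=12)
```

A positive `level_change` or `slope_change` would correspond to increased usage after the intervention; a real analysis would also need noise handling, autocorrelation checks, and confidence intervals.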


Inferring topological transitions in pattern-forming processes with self-supervised learning

Marcin Abram , Keith Burghardt , Greg Ver Steeg , Aram Galstyan , Remi Dingreville

Categories: Computer Vision | Machine Learning

2022-03-19

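Identifying and classifying transitions between topological and microstructural regimes in pattern-forming processes is critical for understanding and fabricating microstructurally precise novel materials in many application domains. Unfortunately, the relevant microstructure transitions may depend on process parameters in subtle and complex ways that classical phase transition theory does not capture. Although supervised machine learning methods may be useful for identifying transition regimes, they need labels, which require prior knowledge of order parameters or of the relevant structures describing these transitions. Motivated by the universality principle for dynamical systems, we instead use a self-supervised approach to solve the inverse problem of predicting process parameters from observed microstructures with neural networks. This approach requires no predefined, labeled data about the different classes of microstructural patterns or about the target task of predicting microstructure transitions. We show that the difficulty of the inverse-problem prediction task is related to the goal of discovering microstructure regimes, because qualitative changes in microstructural patterns correspond to changes in the uncertainty of the predictions for our self-supervised problem. We demonstrate the value of our approach by automatically discovering transitions between microstructural regimes in two distinct pattern-forming processes: the spinodal decomposition of a two-phase mixture and the formation of concentration modulations in binary alloys during physical vapor deposition of thin films. This approach opens a promising path toward discovering and understanding unseen or hard-to-discern transition regimes, and ultimately toward controlling complex pattern-forming processes.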


Detecting Anti-Vaccine Users on Twitter

Matheus Schmitz , Goran Murić , Keith Burghardt

Categories: Natural Language Processing

2021-10-21

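Vaccine hesitancy, recently driven by online narratives, significantly reduces the efficacy of vaccination strategies such as those for COVID-19. Despite broad consensus in the medical community about the safety and efficacy of available vaccines, many social media users continue to be inundated with false information about vaccines and are hesitant or unwilling to be vaccinated. The goal of this study is to better understand anti-vaccine sentiment by developing a system that can automatically identify the users responsible for spreading anti-vaccine narratives. We introduce a publicly available Python package that analyzes a Twitter profile to assess how likely that profile is to share anti-vaccine sentiment in the future. The package is built using text embedding methods, neural networks, and automated dataset generation, and is trained on several million tweets. We find that the model can accurately detect anti-vaccine users before they tweet anti-vaccine hashtags or keywords. We also show examples of how text analysis helps us understand anti-vaccine discussions by detecting moral and emotional differences between anti-vaccine spreaders on Twitter and regular users. Our results will help researchers and policymakers understand how users become anti-vaccine and what they discuss on Twitter. Policymakers can use this information to run better-targeted campaigns that debunk harmful anti-vaccination myths.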


HYRR: Hybrid Infused Reranking for Passage Retrieval

Jing Lu , Keith Hall , Ji Ma , Jianmo Ni

Categories: Natural Language Processing

2022-12-20

We present Hybrid Infused Reranking for Passage Retrieval (HYRR), a framework for training rerankers based on a hybrid of BM25 and neural retrieval models. Retrievers based on hybrid models have been shown to outperform both BM25 and neural models alone. Our approach exploits this improved performance when training a reranker, leading to a robust reranking model. The reranker, a cross-attention neural model, is shown to be robust to different first-stage retrieval systems, achieving better performance than rerankers simply trained upon the first-stage retrievers in the multi-stage systems. We present evaluations on a supervised passage retrieval task using MS MARCO and zero-shot retrieval tasks using BEIR. The empirical results show strong performance on both evaluations.
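
The abstract does not say how the BM25 and neural scores are combined, so the following is only a sketch of one common hybrid scheme: min-max normalize each retriever's scores, then interpolate linearly. The function names, the weight `alpha`, and the passage ids are hypothetical and not taken from HYRR.

```python
def min_max(scores):
    """Rescale a non-empty {passage_id: score} dict to the range [0, 1]."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {pid: 0.0 for pid in scores}
    return {pid: (s - lo) / (hi - lo) for pid, s in scores.items()}

def hybrid_rank(bm25_scores, dense_scores, alpha=0.5):
    """Rank passages by alpha * BM25 + (1 - alpha) * dense, after normalization.

    A passage missing from one retriever's list contributes 0 for that retriever.
    Ties are broken by passage id so the ordering is deterministic.
    """
    bm25_n, dense_n = min_max(bm25_scores), min_max(dense_scores)
    fused = {
        pid: alpha * bm25_n.get(pid, 0.0) + (1 - alpha) * dense_n.get(pid, 0.0)
        for pid in set(bm25_n) | set(dense_n)
    }
    return sorted(fused, key=lambda pid: (-fused[pid], pid))

# Made-up scores for four passages.
bm25 = {"p1": 12.0, "p2": 7.0, "p3": 2.0}
dense = {"p2": 0.9, "p3": 0.8, "p4": 0.1}
ranking = hybrid_rank(bm25, dense)
```

In a multi-stage system, a list like `ranking` would be the candidate set from which a reranker's training examples could be drawn.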


Words as Gatekeepers: Measuring Discipline-specific Terms and Meanings in Scholarly Publications

Li Lucy , Jesse Dodge , David Bamman , Katherine A. Keith

Categories: Natural Language Processing

2022-12-19

Scholarly text is often laden with jargon, or specialized language that divides disciplines. We extend past work that characterizes science at the level of word types, by using BERT-based word sense induction to find additional words that are widespread but overloaded with different uses across fields. We define scholarly jargon as discipline-specific word types and senses, and estimate its prevalence across hundreds of fields using interpretable, information-theoretic metrics. We demonstrate the utility of our approach for science of science and computational sociolinguistics by highlighting two key social implications. First, we measure audience design, and find that most fields reduce jargon when publishing in general-purpose journals, but some do so more than others. Second, though jargon has varying correlation with articles' citation rates within fields, it nearly always impedes interdisciplinary impact. Broadly, our measurements can inform ways in which language could be revised to serve as a bridge rather than a barrier in science.
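
The abstract mentions interpretable, information-theoretic metrics without naming them. As an assumption for illustration only, the sketch below uses normalized pointwise mutual information (NPMI) between a word and a field, which ranges from -1 (never co-occur) through 0 (independent) to 1 (always co-occur). The function name and all counts are hypothetical.

```python
import math

def npmi(count_word_in_field, count_field, count_word, count_total):
    """Normalized PMI between a word type and a field, from token counts.

    count_word_in_field : occurrences of the word inside the field
    count_field         : all tokens in the field
    count_word          : occurrences of the word in the whole corpus
    count_total         : all tokens in the whole corpus
    """
    p_joint = count_word_in_field / count_total
    if p_joint == 0.0:
        return -1.0  # the word never appears in this field
    if p_joint == 1.0:
        return 0.0   # degenerate corpus: the pair carries no information
    p_word = count_word / count_total
    p_field = count_field / count_total
    pmi = math.log(p_joint / (p_word * p_field))
    return pmi / -math.log(p_joint)

# A word that appears only in one field scores near 1 for that field.
field_specific = npmi(10, 10, 10, 100)
# A word spread in proportion to field size scores near 0.
independent = npmi(5, 50, 10, 100)
```

Under this reading, words with a high score for a field would be candidates for that field's jargon; the same computation could be applied to induced word senses instead of word types.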


NusaCrowd: Open Source Initiative for Indonesian NLP Resources

Samuel Cahyawijaya , Holy Lovenia , Alham Fikri Aji , Genta Indra Winata , Bryan Wilie , Rahmad Mahendra , Christian Wibisono , Ade Romadhony , Karissa Vincentio , Fajri Koto

Categories: Natural Language Processing | Artificial Intelligence

2022-12-19

We present NusaCrowd, a collaborative initiative to collect and unite existing resources for Indonesian languages, including opening access to previously non-public resources. Through this initiative, we have brought together 137 datasets and 117 standardized data loaders. The quality of the datasets has been assessed manually and automatically, and their effectiveness has been demonstrated in multiple experiments. NusaCrowd's data collection enables the creation of the first zero-shot benchmarks for natural language understanding and generation in Indonesian and its local languages. Furthermore, NusaCrowd enables the creation of the first multilingual automatic speech recognition benchmark in Indonesian and its local languages. Our work is intended to help advance natural language processing research in under-represented languages.


Emergent Analogical Reasoning in Large Language Models

Taylor Webb , Keith J. Holyoak , Hongjing Lu

Categories: Artificial Intelligence | Natural Language Processing

2022-12-19

The recent advent of large language models - large neural networks trained on a simple predictive objective over a massive corpus of natural language - has reinvigorated debate over whether human cognitive capacities might emerge in such generic models given sufficient training data. Of particular interest is the ability of these models to reason about novel problems zero-shot, without any direct training on those problems. In human cognition, this capacity is closely tied to an ability to reason by analogy. Here, we performed a direct comparison between human reasoners and a large language model (GPT-3) on a range of analogical tasks, including a novel text-based matrix reasoning task closely modeled on Raven's Progressive Matrices. We found that GPT-3 displayed a surprisingly strong capacity for abstract pattern induction, matching or even surpassing human capabilities in most settings. Our results indicate that large language models such as GPT-3 have acquired an emergent ability to find zero-shot solutions to a broad range of analogy problems.


Predicting Autonomous Vehicle Collision Injury Severity Levels for Ethical Decision Making and Path Planning

James E. Pickering , Keith J. Burnham

Categories: Artificial Intelligence

2022-12-16

Autonomous vehicles (AVs) are advancing rapidly and will become a central part of our society within the next 20 years. However, especially in the early stages of deployment, incidents involving AVs are expected. In such incidents, the AV will need to make ethical decisions, e.g., choosing between colliding with a group of pedestrians or with a rigid barrier. For an AV to undertake such ethical decision making and path planning, simulation models of the situation are required that run in real time on board the AV. These models enable path planning and ethical decision making to be undertaken based on predetermined collision injury severity levels. In this research, models are developed for path planning and ethical decision making that predetermine knowledge of the possible collision injury severities, i.e., the peak deformation of the AV colliding with the rigid barrier or the impact velocity of the AV colliding with a pedestrian. Based on this knowledge and using fuzzy logic, a novel nonlinear weighted utility cost function for the collision injury severity levels is developed. This allows the model-based predicted collision outcomes arising from AV peak deformation and AV-pedestrian impact velocity to be examined separately via weighted utility cost functions with a common structure. The general form of the weighted utility cost function exploits a fuzzy sets approach, allowing the utility costs from the two separate cost functions to be meaningfully compared. A decision-making algorithm, which takes a utilitarian ethical approach, ensures that the AV always steers onto the path with the lowest injury severity level, and hence the lowest utility cost to society.


Residual Policy Learning for Powertrain Control

Lindsey Kerbel , Beshah Ayalew , Andrej Ivanco , Keith Loiselle

Categories: Artificial Intelligence | Machine Learning

2022-12-15

Eco-driving strategies have been shown to provide significant reductions in fuel consumption. This paper outlines an active driver assistance approach that uses a residual policy learning (RPL) agent trained to provide residual actions to default powertrain controllers while balancing fuel consumption against other driver-accommodation objectives. Using previous experiences, our RPL agent learns improved traction torque and gear shifting residual policies to adapt the operation of the powertrain to variations and uncertainties in the environment. For comparison, we consider a traditional reinforcement learning (RL) agent trained from scratch. Both agents employ the off-policy Maximum A Posteriori Policy Optimization algorithm with an actor-critic architecture. By implementing both agents on a simulated commercial vehicle in various car-following scenarios, we find that the RPL agent quickly learns significantly improved policies compared to a baseline source policy, although in some measures these are not as good as those eventually possible with the RL agent trained from scratch.
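
In residual policy learning, the final action is generally the default controller's action plus a learned correction. The sketch below shows only that composition for a single continuous action; the controller, the stand-in residual function, the state layout, and the torque limits are hypothetical, and no learning is performed.

```python
def make_residual_policy(base_policy, residual_policy, low, high):
    """Combine a default controller with a learned residual, then clip."""
    def policy(state):
        action = base_policy(state) + residual_policy(state)
        return max(low, min(high, action))
    return policy

# Hypothetical default controller: proportional traction torque on speed error.
# state = (current_speed, target_speed), both in m/s.
def base_torque(state):
    speed, target = state
    return 100.0 * (target - speed)

# Stand-in for a trained residual network: a constant torque reduction.
def learned_residual(state):
    return -20.0

policy = make_residual_policy(base_torque, learned_residual, low=-500.0, high=500.0)
torque = policy((20.0, 22.0))      # 200 from the controller, minus 20
saturated = policy((0.0, 30.0))    # clipped at the upper limit
```

With a zero residual the agent reproduces the default controller, which is one reason such an agent can start from reasonable behavior instead of learning from scratch.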


Driver Assistance Eco-driving and Transmission Control with Deep Reinforcement Learning

Lindsey Kerbel , Beshah Ayalew , Andrej Ivanco , Keith Loiselle

Categories: Artificial Intelligence | Machine Learning

2022-12-15

With the growing need to reduce energy consumption and greenhouse gas emissions, eco-driving strategies provide a significant opportunity for additional fuel savings on top of other technological solutions being pursued in the transportation sector. In this paper, a model-free deep reinforcement learning (RL) control agent is proposed for active eco-driving assistance that trades off fuel consumption against other driver-accommodation objectives and learns optimal traction torque and transmission shifting policies from experience. The training scheme for the proposed RL agent uses an off-policy actor-critic architecture that iterates between policy evaluation with a multi-step return and policy improvement with the maximum a posteriori policy optimization algorithm for hybrid action spaces. The proposed eco-driving RL agent is implemented on a commercial vehicle in car-following traffic. It shows superior performance in minimizing fuel consumption compared to a baseline controller that has full knowledge of fuel-efficiency tables.
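
The multi-step return used for policy evaluation can be illustrated with the standard uncorrected n-step target: the discounted sum of n observed rewards plus a discounted bootstrap from the critic. This sketch assumes that plain form; the abstract does not say whether the paper applies off-policy corrections, and the function name and numbers are hypothetical.

```python
def n_step_return(rewards, bootstrap_value, gamma):
    """Discounted n-step return.

    rewards         : [r_0, ..., r_{n-1}] observed along the trajectory
    bootstrap_value : critic's value estimate for the state reached after n steps
    gamma           : discount factor in [0, 1]
    Computes r_0 + gamma * r_1 + ... + gamma**(n-1) * r_{n-1} + gamma**n * V.
    """
    g = bootstrap_value
    for r in reversed(rewards):  # fold from the last step backwards
        g = r + gamma * g
    return g

target = n_step_return([1.0, 1.0, 1.0], bootstrap_value=4.0, gamma=0.5)
```

The critic would then be regressed toward `target` for the starting state and action.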
